Method And A System For Determining The Location Of An Object

ABSTRACT

A method for determining the location of a transmitter (respectively a receiver) in a space defined by one or more reflective surfaces, including the steps of sending a signal from the transmitter (respectively from a set of transmitters); receiving by a set of receivers (respectively by a receiver) the transmitted signal and echoes of the transmitted signal reflected by the reflective surfaces; finding by a first computing module the location of the virtual sources (respectively virtual receivers) of the echoes; mirroring by a second computing module the virtual sources (respectively virtual receivers) into the space and obtaining mirrored virtual sources (respectively mirrored virtual receivers); combining by a third computing module the mirrored virtual sources (respectively mirrored virtual receivers) so as to obtain the location of the transmitter (respectively the receiver). This method makes use of echoes for localizing the source (respectively receiver) when there is no line of sight between the transmitter(s) and the receiver(s).

REFERENCE DATA

The present invention claims the priority of the PCT Patent Application PCT/EP2013/077694, filed on Dec. 20, 2013 and published under the number WO2014096364, the content of which is incorporated here by reference, and of the US provisional patent application US20130919145 filed on Dec. 20, 2013, the content of which is incorporated here by reference as well.

FIELD OF THE INVENTION

The present invention concerns a method and a system for determining the location of an object such as a receiver or a transmitter, e.g. a microphone, a loudspeaker, a light source, a camera, a photo-diode, a smartphone, a household robot, a person, a neuron, etc.

DESCRIPTION OF RELATED ART

Most audio sensor array applications rely on the precise knowledge of the microphone positions. This motivated the development of several approaches for localization of microphones in an array.

For example, P. Pertila, M. Mieskolainen, and M. Hamalainen, “Closed-form self-localization of asynchronous microphone arrays,” in Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011, pp. 139-144 describes a closed-form method for calculating the relative geometry of multiple microphone arrays with known shapes.

V. C. Raykar and R. Duraiswami, “Automatic position calibration of multiple microphones,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2004, vol. 4, pp. 69-72 describes a maximum-likelihood approach used to find the positions of microphones in an array.

Multidimensional scaling is used to solve a similar problem in S. Birchfield and A. Subramanya, “Microphone array position calibration by basis-point classical multidimensional scaling,” IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 1025-1034, 2005.

N. D. Gaubitch, W. B. Kleijn, and R. Heusdens, “Autolocalization in ad-hoc microphone arrays,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013, pp. 106-110 describes an optimization approach to self-localization of ad-hoc arrays. The solution does not require synchronization between the sources and the array.

A characterization of cases when the solution exists as well as a minimal solver is described in Y. Kuang, S. Burgess, A. Torstensson, and K. Astrom, “A complete characterization and solution to the microphone position self-calibration problem,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013, pp. 3875-3879.

All of the above approaches involve multiple sources and receivers. Furthermore, the methods are independent of the fact that the localization is performed indoors. On the contrary, the reverberation is even considered detrimental.

It is an aim of the present invention to obviate or mitigate one or more of the aforementioned disadvantages.

BRIEF SUMMARY OF THE INVENTION

According to the invention, these aims are achieved by means of a method for determining the location of an object, comprising the steps of

-   sending a signal with one transmitter;
-   receiving by one receiver this signal and echoes of the transmitted signal reflected by one or more reflective surfaces;
-   associating some of these echoes to the one or more reflective surfaces;
-   determining the location of the object on the basis of this association, the object being the transmitter or the receiver.

In contrast to the known approaches, the method according to the invention is a single-channel method for object localization, as it uses one receiver and one transmitter.

Unlike known approaches, the method according to the invention takes advantage of the room reverberation, which makes it possible to use only a single fixed receiver (respectively transmitter) to localize the transmitter (respectively the receiver).

According to one embodiment, it is not necessary to know the location of the source, and it can be inferred from the measurements. According to another embodiment, the location of the source is known.

The method according to the invention uses an echo labeling approach that associates the echoes to the correct reflective surfaces, e.g. the walls of a room.

According to one embodiment, the step of associating some of these echoes to the one or more reflective surfaces comprises:

-   building with a computing module a Euclidean distance matrix;
-   adding to this Euclidean distance matrix a new row and a new column, the new row and the new column corresponding to a combination of some echoes, and obtaining a modified matrix;
-   computing the distance of the modified matrix from a true Euclidean distance matrix;
-   determining the location of the object based on this distance.

The method according to the invention in fact leverages the properties of the Euclidean distance matrix (EDM) for associating echoes recorded by the receiver to the reflective surfaces that generated them.

Echo labeling in other words is performed with the help of Euclidean distance matrices. The EDMs are used as a filter that reveals the correct combinations of echoes.

Advantageously the location and/or the orientation of the reflective surfaces are known.

If the transmitter (or source) and the receiver are synchronized, then the minimum number of needed reflective surfaces is four, provided that the space defined by the reflective surfaces, e.g. a room, is convex, i.e. there is a direct path between the transmitter and the receiver, and provided that EDMs are used.

If mathematical tools other than the Euclidean distance matrix (e.g. multilateration) are used for associating echoes recorded by the receiver to the reflective surfaces that generated them, the minimum number of needed reflective surfaces is three, provided that the transmitter and the receiver are synchronized, and provided that the space defined by the reflective surfaces, e.g. a room, is convex.

If the transmitter (or source) and the receiver are not synchronized, then the minimum number of needed reflective surfaces is four, provided that mathematical tools other than the Euclidean distance matrix (e.g. multilateration) are used.

In a first preferred embodiment of the method according to the invention, the Euclidean distance matrix comprises the distances between the transmitter and the first order image sources. In this first embodiment, the method according to the invention makes it possible to determine the (unknown) location of the receiver. In a variant, it is not necessary to know the location of the transmitter, and it can be inferred from the measurements. In another variant, the location of the transmitter is known.

In the context of the present invention, the expression “first order image source” indicates an image source of an echo of the first order. According to the known image source model, an echo from a reflective surface is replaced by a virtual source (or “image source”) behind the reflective surface in a mirrored location of the original (and real) source.

In a second preferred embodiment of the method according to the invention, the Euclidean distance matrix comprises the distances between the receiver and the first order image receivers. In this second embodiment, the method according to the invention makes it possible to determine the (unknown) location of the transmitter. In a variant, it is not necessary to know the location of the receiver, and it can be inferred from the measurements. In another variant, the location of the receiver is known.

According to a preferred embodiment, only first order echoes are considered. These first order echoes could be determined by considering the echoes received during a predetermined time window. This time window could depend on the location and/or the orientation of the reflective surfaces, and on the location of the transmitter respectively receiver. According to another embodiment, also higher order echoes are considered.

The method according to the invention then comprises a step of labeling echoes, i.e. determining which of the peaks of the impulse response received by the receiver correspond to which reflective surface.

The method according to the invention checks whether the modified matrix still satisfies the rank property according to which a Euclidean distance matrix built from points in R^(n) has a rank of at most n+2, n being a positive integer.

In other words, the method according to the invention tests at least some echo combinations and selects the combination for which the rank property is satisfied.

According to an embodiment, the method according to the invention comprises multi-dimensional scaling. In particular, it could apply an s-stress criterion.

The present invention concerns also a system for determining the location of an object, this system comprising:

-   one transmitter for sending a signal;
-   one receiver for receiving the transmitted signal and the echoes of the transmitted signal as reflected by one or more reflective surfaces;
-   a first computing module for associating some of these echoes to the one or more reflective surfaces;
-   a second computing module for determining the location of the object on the basis of this association, the object being the transmitter or the receiver.

In one embodiment the first computing module is configured for:

-   building a Euclidean distance matrix;
-   adding to this Euclidean distance matrix a new row and a new column, the new row and the new column corresponding to a combination of some echoes, and obtaining a modified matrix;
-   computing the distance of the modified matrix from a true Euclidean distance matrix.

In one embodiment, the second computing module is configured for determining the location of the object based on the computed information, i.e. based on the computed distance.

In one preferred embodiment, the first and second modules are the same module.

In one preferred embodiment, the receiver is a microphone, and the transmitter is a loudspeaker. The microphone and/or the loudspeaker could belong to a device, e.g. a mobile device, e.g. a smartphone or a tablet.

The reflective surfaces could be the walls of a room, e.g. a convex room or a non-convex room.

In another preferred embodiment, the transmitter is a light source, e.g. a laser, an LED, etc., and the receiver is a light sensitive device such as a photo diode or a camera. The reflective surfaces could be mirrors.

Experiments performed by the applicant have demonstrated the effectiveness, the accuracy and the robustness of the proposed method and system.

The present invention concerns also a computer program product for determining location of an object, comprising:

a tangible computer usable medium including computer usable program code being used for:

-   associating some of said echoes of a signal transmitted by a transmitter and received by a receiver, as reflected by one or more reflective surfaces, to these reflective surfaces;
-   determining the location of said object on the basis of said association, said object being said transmitter or said receiver.

The present invention concerns also a computer data carrier storing presentation content created with the described method.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be better understood with the aid of the description of an embodiment given by way of example and illustrated by the figures, in which:

FIG. 1 shows a top view of a room comprising a source or transmitter and a receiver.

FIG. 2 shows a perspective view of a room comprising a source or transmitter and a receiver, and an example of the image source model for the first and second order echoes.

FIG. 3A shows a room comprising a transmitter and a receiver, and FIG. 3B shows an example of echoes received by the receiver of FIG. 3A from the walls of the room of FIG. 3A.

FIG. 4 shows a perspective view of a room comprising a transmitter and a receiver, and an example of the image source model.

FIG. 5 shows a perspective view of a room in which the inventive method has been applied.

FIG. 6 illustrates an embodiment of a data processing system in which a method in accordance with an embodiment of the present invention may be implemented.

DETAILED DESCRIPTION OF POSSIBLE EMBODIMENTS OF THE INVENTION

The present invention will now be described in more detail in connection with its embodiment for determining the location of a microphone (or, in general, of a receiver) by knowing the geometry of a room, i.e. the location and the orientation of its walls (or, in general, of its reflective surfaces), and the location of a loudspeaker (or, in general, of a transmitter). However, the present invention is also applicable in connection with many other fields, as will be discussed. For example, the described method and system can be used for determining the location of a transmitter by knowing the geometry of a room, i.e. the location and the orientation of its walls or reflective surfaces, and the location of a receiver.

The present invention will now be described in more detail in connection with an audio signal. However, the present invention is also applicable in connection with other kinds of signals, e.g. and in a non-limiting way a light signal, an RF signal, a UWB signal, an ultrasound signal, etc.

The present invention will be now described in more detail in connection with a room. However, the present invention does not necessarily need to be applied in a closed room.

The first and second order echoes concept is described in FIG. 1.

FIG. 1 illustrates a top view of a room defined by the walls W1, W2 and by other walls not represented, and comprising a source or transmitter S and a receiver R. The source can be for example and in a non-limitative way a loudspeaker and the receiver a microphone. The walls W1, W2 are reflective surfaces, i.e. surfaces allowing a signal to be reflected, the angle at which the signal is incident on the surface being equal to the angle at which it is reflected.

A first audio signal transmitted by the source S is reflected by the wall W2. The reflected signal or echo e1 is then received by the receiver R. Since there is a single reflection of the transmitted signal before its reception by the receiver R, the echo e1 is a first-order echo. A second audio signal transmitted by the source S is reflected first by the wall W2 and after by the wall W1: the reflected signal or echo e2 is then received by the receiver R. Since there are two reflections of the transmitted signal before its reception by the receiver R, the echo e2 is a second-order echo.

The time of arrival (TOA) is defined as the travel time from a source S to a receiver R. The audio signals e1 and e2 can have different times of arrival (TOAs).

In the context of the present invention the expression “time of arrival” or “TOA” indicates the absolute propagation time of an echo between the transmitter and the receiver, or the difference of the time of arrival of an echo from the time of arrival of another echo (reference). In the first case the transmitter and the receiver are synchronised, in the second case they are not synchronized.

FIG. 2 shows a perspective view of a room 100 of known dimensions and shape, and having a section corresponding to a K-faced polygon. As the room shape is known, the location of wall vertices p_(i) ∈ R³ is available. In the room 100 of FIG. 2 there is one loudspeaker with known location s ∈ R³.

The sound propagation inside the room 100 can be modeled by the room impulse response (RIR). An RIR describes the acoustic channel between the source s and the receiver r inside the room 100. It depends on the shape of the room 100 and locations of the loudspeaker s and the microphone r. Ideally, it is a train of Diracs, each corresponding to an echo:

${h(t)} = {\sum\limits_{i}\; {c_{i}{\delta \left( {t - t_{i}} \right)}}}$

where c_(i) and t_(i) are the amplitude and time of arrival of the ith echo.

The loudspeaker s does not need to be synchronized with the microphone r. Without synchronization, only the differences between the times of arrival of the echoes at the microphone r can be measured.

If the loudspeaker s and the microphone r are synchronized, the time of arrival corresponds to the absolute propagation time of the signal between the loudspeaker s and the microphone r.

The microphone r hears the convolution of the signal transmitted by the loudspeaker s with the RIR. By measuring the RIR it is possible to access the echo times t_(i). These echo times can be linked to the room geometry and the microphone location with the image source model. According to this model, it is possible to replace an echo from a wall by a virtual source behind the wall in a mirrored location of the original source.
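As a hedged illustration of this signal model, the following Python sketch samples an ideal RIR (a train of Diracs) on a discrete time grid and convolves it with a transmitted signal. The function names `sampled_rir` and `convolve`, the sampling rate and the echo amplitudes and delays are assumptions chosen for illustration only; they are not part of the described method.

```python
def sampled_rir(echoes, fs, length):
    """Sample an ideal RIR given as (amplitude, time_of_arrival) pairs.

    Each echo is rounded to the nearest sample at sampling rate fs (Hz).
    """
    h = [0.0] * length
    for amplitude, t in echoes:
        k = round(t * fs)
        if 0 <= k < length:
            h[k] += amplitude
    return h


def convolve(x, h):
    """Plain discrete convolution of the transmitted signal x with the RIR h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


fs = 8000                                   # hypothetical sampling rate (Hz)
echoes = [(1.0, 0.0), (0.5, 2 / fs)]        # direct sound and one echo
h = sampled_rir(echoes, fs, length=3)
received = convolve([1.0, 0.5], h)          # what the microphone would record
print(h, received)
```

In practice the RIR is not known in advance but measured, and the echo times are then read from its peaks; the sketch only shows the forward model.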

As illustrated in FIG. 2, virtual sources $\tilde{s}$ are mirror images of the true source s across the corresponding reflecting walls i, j. The image $\tilde{s}_{i}$ of the source s with respect to the ith wall is computed as

$\tilde{s}_{i} = s + 2\langle p_{i} - s, n_{i}\rangle n_{i}, \qquad (1)$

where $n_{i}$ is the unit normal to the ith wall and $\langle\cdot,\cdot\rangle$ denotes the inner product. The time of arrival of the echo from the ith wall is $t_{i} = \|\tilde{s}_{i} - r\| / c$, where c is the speed of sound and r is the location of the microphone.
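Formula (1) and the associated time of arrival can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `image_source` and `time_of_arrival`, the default sound speed of 343 m/s and the example coordinates are hypothetical and chosen only for illustration.

```python
import math


def image_source(s, p, n):
    """Mirror the source s across the wall through point p with unit normal n.

    Implements formula (1): s_img = s + 2 <p - s, n> n.
    """
    dot = sum((pi - si) * ni for pi, si, ni in zip(p, s, n))
    return tuple(si + 2.0 * dot * ni for si, ni in zip(s, n))


def time_of_arrival(s_img, r, c=343.0):
    """Time of arrival of the echo modeled by image source s_img at receiver r."""
    return math.dist(s_img, r) / c


s = (1.0, 1.0, 1.0)                  # hypothetical loudspeaker location (m)
r = (2.0, 1.0, 1.0)                  # hypothetical microphone location (m)
wall_point, wall_normal = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)   # the wall x = 0

s_img = image_source(s, wall_point, wall_normal)
print(s_img, time_of_arrival(s_img, r))
```

The result does not depend on the sign of the normal: flipping n changes the sign of the inner product and of n at the same time.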

Assuming that the sound speed inside the room is fixed and known, it is possible to relate the time of arrival of the echoes to the mutual distances of the microphone and the image sources.

As the geometry of the room 100 and the location of the loudspeaker s are known, the locations of the image sources $\tilde{s}_{i}$ can be determined by formula (1).

In order to be able to find the location of the microphone r, it is necessary to know the correspondences of the echoes recorded by the microphone r with the image sources $\tilde{s}_{i}$. In other words, it is necessary to know which echo comes from which wall.

There are however two main problems:

-   not all the echoes extracted from the impulse response correspond to first order image sources;
-   the echoes arrive at the microphone r in different orders depending on the location of the microphone r.

FIG. 3B illustrates an example of an echo measurement made in the room illustrated in FIG. 3A, comprising the walls N, W, S and E, the floor F and the ceiling C (not illustrated for the sake of clarity). As can be seen from FIG. 3B, many of the extracted peaks in the impulse response do not correspond to a valid image source.

Advantageously the method according to the invention applies a step of echo labeling for solving the aforementioned problems.

The echo labeling procedure makes it possible:

-   first, to extract the correct echoes from the impulse response, and
-   second, to find the right assignment of these echoes to the walls.

If we consider the setup of FIG. 4, let D ∈ R^((K+1)×(K+1)) be a matrix whose entries are as follows:

$d[i,j] = \begin{cases} 0 & i = j = 1 \\ \|s - \tilde{s}_{j-1}\|^{2} & i = 1,\ 2 \leq j \leq K+1 \\ \|\tilde{s}_{i-1} - s\|^{2} & j = 1,\ 2 \leq i \leq K+1 \\ \|\tilde{s}_{i-1} - \tilde{s}_{j-1}\|^{2} & 2 \leq i, j \leq K+1 \end{cases} \qquad (2)$

where $\tilde{s}_{1}, \ldots, \tilde{s}_{K}$ are the locations of the first order image sources. The first row and the first column of D thus correspond to the real source s, and the remaining K rows and columns to the K image sources.

As the geometry of the room 100 and the location of the loudspeaker s are known, D is a Euclidean distance matrix (EDM) with known entries.
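A minimal Python sketch of how such a matrix D could be built from the source and image source locations is given below. The function name `squared_distance_matrix` and the coordinates are assumptions: the coordinates are hypothetical values chosen so that the resulting squared distances agree, to two decimals, with the grey block of the example matrices D_(aug,1) and D_(aug,2) given further below in this description.

```python
import math


def squared_distance_matrix(points):
    """Matrix of squared Euclidean distances between all pairs of points."""
    return [[math.dist(p, q) ** 2 for q in points] for p in points]


# Hypothetical geometry: the source at the origin and its six first order
# image sources, one per wall of a shoebox room.
source = (0.0, 0.0, 0.0)
image_sources = [
    (1.0, 0.0, 0.0), (0.0, 1.2, 0.0), (0.0, -0.8, 0.0),
    (0.0, 0.0, 1.6), (0.0, 0.0, -2.4), (-1.0, 0.0, 0.0),
]

# Row/column 1 is the real source, rows/columns 2..K+1 the image sources.
D = squared_distance_matrix([source] + image_sources)
for row in D:
    print(" ".join(f"{value:5.2f}" for value in row))
```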

As the loudspeaker s emits a sound, the microphone r receives the direct sound (the first peak in its RIR), K first order echoes from the walls (subsequent peaks in its RIR) and a number of higher order echoes too. In one embodiment only the first order echoes are considered. However, the method according to the invention is not limited to the use of first order echoes only, as higher order echoes could be used as well. In one preferred embodiment, a time window is used to select the first order echoes.

The method according to the invention makes it possible to extract these echoes from the RIR and label them according to their corresponding reflective surface (i.e. the wall).

To this end, a fundamental property of EDMs is used: an EDM corresponding to a point set in R^(n) has rank at most n+2. Thus, in 3D its rank is at most 5.
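The rank bound follows from a standard decomposition, sketched here with notation that is not in the original text: let the rows of X ∈ R^(m×n) be the m points and let g collect their squared norms. Since the squared distance between points i and j is $\|x_{i}\|^{2} + \|x_{j}\|^{2} - 2 x_{i}^{T} x_{j}$, the EDM can be written as

$D = \mathbf{1} g^{T} + g \mathbf{1}^{T} - 2 X X^{T}, \qquad g = \mathrm{diag}(X X^{T}),$

where $\mathbf{1}$ is the all-ones vector. The first two terms each have rank at most 1 and $X X^{T}$ has rank at most n, so

$\mathrm{rank}(D) \leq 1 + 1 + n = n + 2.$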

The Euclidean distance matrix D is then augmented as follows: (K+1) echoes are chosen from the RIR of the microphone r, and the Euclidean distance matrix D is augmented with these (K+1) echoes by adding an extra column and an extra row to D. A modified or augmented matrix D_(aug) is thus obtained.

If these (K+1) echoes are correctly assigned to the image sources, then they represent the distances of the microphone r from these image sources, and the augmented matrix D_(aug) is an EDM. It is therefore low rank: it corresponds to a point set in R^(n) and has a rank of at most n+2.

However, if these (K+1) echoes are not correctly selected or they do not have the right permutation, then the augmented matrix will not be an EDM.

Two examples of modified or augmented matrices are given below. The grey part of each matrix (its upper-left 7×7 block) is the starting Euclidean distance matrix D, which comprises the distances between the source s and the image sources. The matrix D has been augmented with a first, respectively a second, combination of echoes extracted from the microphone RIR, yielding the matrices D_(aug,1) and D_(aug,2).

$D_{aug,1} = \overset{\begin{matrix} S & \tilde{S}_{1} & \tilde{S}_{2} & \tilde{S}_{3} & \tilde{S}_{4} & \tilde{S}_{5} & \tilde{S}_{6} \end{matrix}}{\begin{bmatrix}
0.00 & 1.00 & 1.44 & 0.64 & 2.56 & 5.76 & 1.00 & 0.12 \\
1.00 & 0.00 & 2.44 & 1.64 & 3.56 & 6.76 & 4.00 & 1.52 \\
1.44 & 2.44 & 0.00 & 4.00 & 4.00 & 7.20 & 2.44 & 2.04 \\
0.64 & 1.64 & 4.00 & 0.00 & 3.20 & 6.40 & 1.64 & 0.44 \\
2.56 & 3.56 & 4.00 & 3.20 & 0.00 & 16.0 & 3.56 & 3.32 \\
5.76 & 6.76 & 7.20 & 6.40 & 16.0 & 0.00 & 6.76 & 4.92 \\
1.00 & 4.00 & 2.44 & 1.64 & 3.56 & 6.76 & 0.00 & 0.72 \\
0.12 & 1.52 & 2.04 & 0.44 & 3.32 & 4.92 & 0.72 & 0.00
\end{bmatrix}}$

$D_{aug,2} = \overset{\begin{matrix} S & \tilde{S}_{1} & \tilde{S}_{2} & \tilde{S}_{3} & \tilde{S}_{4} & \tilde{S}_{5} & \tilde{S}_{6} \end{matrix}}{\begin{bmatrix}
0.00 & 1.00 & 1.44 & 0.64 & 2.56 & 5.76 & 1.00 & 4.92 \\
1.00 & 0.00 & 2.44 & 1.64 & 3.56 & 6.76 & 4.00 & 1.52 \\
1.44 & 2.44 & 0.00 & 4.00 & 4.00 & 7.20 & 2.44 & 2.04 \\
0.64 & 1.64 & 4.00 & 0.00 & 3.20 & 6.40 & 1.64 & 0.44 \\
2.56 & 3.56 & 4.00 & 3.20 & 0.00 & 16.0 & 3.56 & 3.32 \\
5.76 & 6.76 & 7.20 & 6.40 & 16.0 & 0.00 & 6.76 & 0.12 \\
1.00 & 4.00 & 2.44 & 1.64 & 3.56 & 6.76 & 0.00 & 0.72 \\
4.92 & 1.52 & 2.04 & 0.44 & 3.32 & 0.12 & 0.72 & 0.00
\end{bmatrix}}$

As discussed, if the echoes are selected correctly and have the right order, then the augmented matrix is an EDM. The matrix D_(aug,1) is an EDM. But since the echoes are not correctly ordered in D_(aug,2), D_(aug,2) is not an EDM. For example, the echoes indicated by a rectangle (the entries 0.12 and 4.92 of the last row and column, which are swapped with respect to D_(aug,1)) do not appear in the correct order in the matrix.

In other words, since D_(aug,1) contains the correct permutation of the echoes, it is an EDM; since D_(aug,2) does not contain the correct permutation of the echoes, it is not an EDM.
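The rank property can be checked numerically on these two example matrices. The following Python sketch is an illustration under stated assumptions, not the implementation of the invention: the helper names `matrix_rank` and `augment` and the tolerance `tol` are hypothetical, and the rank is estimated by plain Gaussian elimination with partial pivoting.

```python
def matrix_rank(a, tol=1e-9):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        # Largest remaining entry of this column, at or below row `rank`.
        pivot = max(range(rank, rows), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, rows):
            factor = m[i][col] / m[rank][col]
            for j in range(col, cols):
                m[i][j] -= factor * m[rank][j]
        rank += 1
    return rank


def augment(D, d):
    """Append d as an extra column and an extra row, with a zero corner."""
    return [row + [x] for row, x in zip(D, d)] + [list(d) + [0.0]]


# Grey block D of the example matrices (source and six image sources).
D = [
    [0.00, 1.00, 1.44, 0.64, 2.56, 5.76, 1.00],
    [1.00, 0.00, 2.44, 1.64, 3.56, 6.76, 4.00],
    [1.44, 2.44, 0.00, 4.00, 4.00, 7.20, 2.44],
    [0.64, 1.64, 4.00, 0.00, 3.20, 6.40, 1.64],
    [2.56, 3.56, 4.00, 3.20, 0.00, 16.0, 3.56],
    [5.76, 6.76, 7.20, 6.40, 16.0, 0.00, 6.76],
    [1.00, 4.00, 2.44, 1.64, 3.56, 6.76, 0.00],
]
d_first = [0.12, 1.52, 2.04, 0.44, 3.32, 4.92, 0.72]    # last column of D_aug,1
d_second = [4.92, 1.52, 2.04, 0.44, 3.32, 0.12, 0.72]   # last column of D_aug,2

rank_1 = matrix_rank(augment(D, d_first))
rank_2 = matrix_rank(augment(D, d_second))
print(rank_1 <= 5, rank_2 <= 5)
```

With exact data the test is a strict rank threshold; with noisy measurements a fixed tolerance becomes fragile, which is why a closeness measure is preferable in that case.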

More formally, let e list the candidate distances computed from the RIR recorded by the microphone r. The matrix D is then augmented with a combination of K unlabelled squared distances $d_{(i_{1},\ldots,i_{K})}$ to get D_(aug) as follows:

$D_{aug}\left(d_{(i_{1},\ldots,i_{K})}\right) = \begin{bmatrix} D & d_{(i_{1},\ldots,i_{K})} \\ d_{(i_{1},\ldots,i_{K})}^{T} & 0 \end{bmatrix}$

The column vector $d_{(i_{1},\ldots,i_{K})}$ is constructed as

$d_{(i_{1},\ldots,i_{K})}[k] = e^{2}[i_{k}]$

with $i_{k}$ ∈ {1; . . . ; length(e)}. In words, a candidate combination of echoes d is constructed by selecting K echoes out of all the echoes extracted from the microphone RIR.

In general length(e)≠K, meaning that it would be possible to choose more than K echoes from the RIR of the microphone, out of which one can try permutations of length K in the augmented Euclidean distance matrix. Thus e can contain first or higher order echoes, as well as wrongly picked peaks in the RIR.
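The enumeration of candidate vectors can be sketched as follows. This is a minimal Python sketch; the function name `candidate_vectors` and the echo values are assumptions for illustration, and indices are 0-based as usual in Python rather than 1-based as in the text.

```python
import itertools


def candidate_vectors(e, K):
    """Yield (indices, d) for every ordered selection of K echoes out of e.

    d[k] = e[i_k] ** 2, i.e. the squared candidate distances in the order
    in which they would be assigned to the K image sources.
    """
    for indices in itertools.permutations(range(len(e)), K):
        yield indices, [e[i] ** 2 for i in indices]


e = [1.0, 2.0, 3.0, 5.0]        # hypothetical candidate distances (m)
candidates = list(candidate_vectors(e, K=3))
print(len(candidates))          # number of ordered selections to be scored
print(candidates[0])
```

The number of candidates grows quickly with length(e), which is why restricting e to a time window is useful.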

If rank(D_(aug)) ≤ 5, or more specifically if D_(aug) satisfies the EDM properties, then the selected combination of echoes is the correct permutation.

The measurements for both D and e are often noisy. Instead of checking whether the augmented matrix is an EDM, it is possible to check how close the augmented matrix D_(aug) is to an EDM. Multi-dimensional scaling (MDS) is used to define such a measure of closeness. MDS tries to find the point set in a given dimension (e.g. in three dimensions) that produces the EDM closest to D_(aug).

In one embodiment, the s-stress criterion is used. For each selection of echoes that results in a candidate matrix $\tilde{D}_{aug}$, s-stress($\tilde{D}_{aug}$) is the optimal value of the following optimization problem:

$\underset{D_{aug} \in \mathrm{EDM}^{(3)}}{\text{minimize}} \; \sum_{i,j} \left( D_{aug}[i,j] - \tilde{D}_{aug}[i,j] \right)^{2}, \qquad (3)$

wherein EDM⁽³⁾ denotes the set of EDMs generated by point sets in R³. The value s-stress($\tilde{D}_{aug}$) is the score of the matrix $\tilde{D}_{aug}$, used to assess the likelihood that a permutation of echoes is correct.
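Since every matrix in EDM⁽³⁾ is generated by some point set in R³, problem (3) can equivalently be written over point coordinates. With m denoting the number of rows of $\tilde{D}_{aug}$ and $x_{1}, \ldots, x_{m}$ denoting candidate points (notation introduced here for illustration only), the same score reads

$\text{s-stress}(\tilde{D}_{aug}) = \min_{x_{1},\ldots,x_{m} \in \mathbb{R}^{3}} \; \sum_{i,j} \left( \|x_{i} - x_{j}\|^{2} - \tilde{D}_{aug}[i,j] \right)^{2}.$

The score is zero exactly when $\tilde{D}_{aug}$ is itself an EDM of points in R³, and it grows as the candidate echoes become less consistent with any single microphone position.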

For solving the optimization problem (3), it is possible to use a method that finds the global minimum of the s-stress function in almost every case. The combination of echoes that results in the minimum value of the s-stress score is then selected as the correct permutation.

Here below is an example of a method for finding the correct permutation of the echoes:

-   i. For every candidate $d_{(i_{1},\ldots,i_{K})}$, compute score[$d_{(i_{1},\ldots,i_{K})}$] ← s-stress($\tilde{D}_{aug}$);
-   ii. Find the minimum score collected in score;
-   iii. Use the found echo combination and the image source locations to compute the microphone location.
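The three steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the method of the invention as such: instead of the s-stress score it uses a simpler surrogate (the residual of a linearized least-squares multilateration against the known source and image source locations), it takes the shortest candidate distance as the direct sound, and all names (`solve3`, `locate`, `residual`, `label_echoes`) and all coordinates are hypothetical.

```python
import itertools
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, 3):
            f = m[i][c] / m[c][c]
            for j in range(c, 4):
                m[i][j] -= f * m[c][j]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x


def locate(anchors, sq_dists):
    """Least-squares point r with |r - anchors[k]|^2 close to sq_dists[k]."""
    a0, d0 = anchors[0], sq_dists[0]
    n0 = sum(v * v for v in a0)
    rows, rhs = [], []
    for a, d in zip(anchors[1:], sq_dists[1:]):
        # Subtracting the equation of anchors[0] removes the |r|^2 term.
        rows.append([2.0 * (ai - a0i) for ai, a0i in zip(a, a0)])
        rhs.append(sum(v * v for v in a) - n0 - d + d0)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(ata, atb)


def residual(anchors, sq_dists, r):
    """Surrogate score: mismatch between candidate and implied squared distances."""
    return sum((math.dist(a, r) ** 2 - d) ** 2 for a, d in zip(anchors, sq_dists))


def label_echoes(anchors, e):
    """Steps i-iii: score every ordered selection, keep the minimum, locate r."""
    ordered = sorted(e)
    direct, rest = ordered[0], ordered[1:]      # assumption: first peak = direct sound
    best = None
    for combo in itertools.permutations(rest, len(anchors) - 1):
        d = [direct ** 2] + [x ** 2 for x in combo]
        r = locate(anchors, d)
        score = residual(anchors, d, r)
        if best is None or score < best[0]:
            best = (score, combo, r)
    return best


# Hypothetical 4 m x 3 m x 2.5 m room: source and its six first order images.
anchors = [(1.0, 1.2, 0.9), (-1.0, 1.2, 0.9), (7.0, 1.2, 0.9), (1.0, -1.2, 0.9),
           (1.0, 4.8, 0.9), (1.0, 1.2, -0.9), (1.0, 1.2, 4.1)]
true_mic = (2.53, 0.71, 1.37)
e = [math.dist(a, true_mic) for a in anchors] + [3.3]   # 3.3 m: spurious peak
score, combo, estimate = label_echoes(anchors, e)
print(score, estimate)
```

In a room that is symmetric with respect to the source, several assignments can be equally consistent, so a unique minimum should not be taken for granted.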

Although the method according to the invention needs to check echo combinations and permutations, in one preferred embodiment it is not necessary to test all echo combinations. The dimensions of the room together with the location of the loudspeaker define the size of a temporal window in which all the first order echoes lie.
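One possible way to derive such a window is sketched below in Python; it is an assumption made for illustration, not a rule stated in this description. For a convex room, the distance from an image source to any point of the room is largest at a room vertex, so the largest image-source-to-vertex distance, divided by the sound speed, bounds the arrival time of every first order echo when transmitter and receiver are synchronized. The function name `first_order_window` and the room are hypothetical.

```python
import itertools
import math


def first_order_window(image_sources, room_vertices, c=343.0):
    """Upper bound (s) on the time of arrival of any first order echo.

    Assumes a convex room, a receiver inside it, and a known sound speed c.
    """
    longest = max(math.dist(img, v) for img in image_sources for v in room_vertices)
    return longest / c


# Hypothetical 2 m cube with the loudspeaker at its centre (1, 1, 1).
vertices = list(itertools.product((0.0, 2.0), repeat=3))
images = [(-1.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.0, -1.0, 1.0),
          (1.0, 3.0, 1.0), (1.0, 1.0, -1.0), (1.0, 1.0, 3.0)]

window = first_order_window(images, vertices)
print(window)
```

Peaks of the RIR arriving later than this bound cannot be first order echoes and can be discarded before the combinations are enumerated.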

FIG. 5 illustrates a sketch of a room where the method according to the invention has been applied. The image sources of the loudspeaker are shown with stars. The image source of the floor ($\tilde{s}_{4}$) is not shown for better visualization.

The room dimensions are known a priori and the loudspeaker location was measured during the experiment. As the loudspeaker s is placed against a wall, the image source for this wall has not been considered. The matrix D, defined in formula (2), is

$D \approx \begin{bmatrix}
0.00 & 25.40 & 178.48 & 5.91 & 4.66 & 10.38 \\
25.40 & 0.00 & 203.90 & 55.40 & 30.07 & 35.77 \\
178.48 & 203.90 & 0.00 & 172.38 & 183.15 & 188.86 \\
5.91 & 55.40 & 172.38 & 0.00 & 10.58 & 16.28 \\
4.66 & 30.07 & 183.15 & 10.58 & 0.00 & 28.94 \\
10.38 & 35.77 & 188.86 & 16.28 & 28.94 & 0.00
\end{bmatrix}.$

This matrix D has been augmented with 6-tuples of echoes selected from the microphone's RIR. For each combination the value of s-stress(D_(aug)) has been calculated. The combination that results in the minimum score is selected as the correct combination and the microphone location is found using the estimated permutation of the echoes.

The actual distance between the loudspeaker and the microphone is 3.684 m and the estimated distance is 3.680 m. Thus, in this example, the distance of the microphone from the loudspeaker has been estimated with an error of less than 1 cm.

The method according to the invention, which uses Euclidean distance matrices to detect the correct echo combinations, can localize the microphone in a realistic scenario with the positioning error in the order of a cm in a room whose sides are several meters long.

In one embodiment, the method according to the invention could be applied to rooms with more general geometries (e.g. non-convex).

In another embodiment, the method according to the invention could be used for performing joint source-microphone localization.

In another embodiment, the method according to the invention could be integrated within a comprehensive indoor localization system.

FIG. 6 is an embodiment of a data processing system 300 in which an embodiment of a method of the present invention may be implemented. The data processing system 300 of FIG. 6 may be located and/or otherwise operate at any node of a computer network, which may for example comprise clients, servers, etc., and which is not illustrated in the figure. In the embodiment illustrated in FIG. 6, data processing system 300 includes communications fabric 302, which provides communications between processor unit 304, memory 306, persistent storage 308, communications unit 310, input/output (I/O) unit 312, and display 314.

Processor unit 304 serves to execute instructions for software that may be loaded into memory 306. Processor unit 304 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 304 may be implemented using one or more heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, the processor unit 304 may be a symmetric multi-processor system containing multiple processors of the same type.

In some embodiments, the memory 306 shown in FIG. 6 may be a random access memory or any other suitable volatile or non-volatile storage device. The persistent storage 308 may take various forms depending on the particular implementation. For example, the persistent storage 308 may contain one or more components or devices. The persistent storage 308 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above. The media used by the persistent storage 308 also may be removable such as, but not limited to, a removable hard drive.

The communications unit 310 shown in FIG. 6 provides for communications with other data processing systems or devices. In these examples, communications unit 310 is a network interface card. Modems, cable modems and Ethernet cards are just a few of the currently available types of network interface adapters. Communications unit 310 may provide communications through the use of either or both physical and wireless communications links.

The input/output unit 312 shown in FIG. 6 enables input and output of data with other devices that may be connected to data processing system 300. In some embodiments, input/output unit 312 may provide a connection for user input through a keyboard and mouse. Further, input/output unit 312 may send output to a printer. Display 314 provides a mechanism to display information to a user.

Instructions for the operating system and applications or programs are located on the persistent storage 308. These instructions may be loaded into the memory 306 for execution by processor unit 304. The processes of the different embodiments may be performed by processor unit 304 using computer implemented instructions, which may be located in a memory, such as memory 306. These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 304. The program code in the different embodiments may be embodied on different physical or tangible computer readable media, such as memory 306 or persistent storage 308.

Program code 316 is located in a functional form on the computer readable media 318 that is selectively removable and may be loaded onto or transferred to data processing system 300 for execution by processor unit 304. Program code 316 and computer readable media 318 form a computer program product 320 in these examples. In one example, the computer readable media 318 may be in a tangible form, such as, for example, an optical or magnetic disc that is inserted or placed into a drive or other device that is part of persistent storage 308 for transfer onto a storage device, such as a hard drive that is part of persistent storage 308. In a tangible form, the computer readable media 318 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory that is connected to data processing system 300. The tangible form of computer readable media 318 is also referred to as computer recordable storage media. In some instances, computer readable media 318 may not be removable.

Alternatively, the program code 316 may be transferred to data processing system 300 from computer readable media 318 through a communications link to communications unit 310 and/or through a connection to input/output unit 312. The communications link and/or the connection may be physical or wireless in the illustrative examples. The computer readable media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.

The different components illustrated for data processing system 300 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 300. Other components shown in FIG. 6 can be varied from the illustrative examples shown. For example, a storage device in data processing system 300 is any hardware apparatus that may store data. Memory 306, persistent storage 308, and computer readable media 318 are examples of storage devices in a tangible form.

Therefore, as explained at least in connection with FIG. 6, the present invention is also directed to a system for determining the location of an object, a computer program product for determining the location of an object, and a computer data carrier.

In accordance with a further embodiment of the present invention, there is provided a computer data carrier storing presentation content created while employing the methods of the present invention.

Although the present invention has been described in more detail in connection with its embodiment for determining the location of a microphone, the present invention is applicable in many other fields.

It can be used for source localization using a single microphone, by knowing the location of the microphone and the geometry of the room.
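The single-microphone case can be illustrated with a minimal sketch, which is not the claimed implementation. It assumes a shoebox room with known dimensions, a known point (here the microphone), squared path lengths already extracted from the impulse response, an identified direct path, and exactly one first-order echo per wall. The echo path via a wall has the same length as the distance from the unknown source to the corresponding image of the microphone, so a candidate labeling can be tested by appending its squared distances as a new row and column to the Euclidean distance matrix of the microphone and its images and checking the rank property (rank at most n+2 in R^n). All function names, the room dimensions and the positions below are hypothetical, and the final position is obtained by a plain linear least-squares step rather than by multi-dimensional scaling.

```python
import itertools
import numpy as np

def squared_edm(points):
    """Matrix of squared pairwise distances between the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return (diff ** 2).sum(axis=-1)

def first_order_images(point, room):
    """First-order images of `point` in a shoebox room [0,Lx] x [0,Ly] x [0,Lz]."""
    point = np.asarray(point, dtype=float)
    images = []
    for axis, length in enumerate(room):
        for wall in (0.0, float(length)):
            img = point.copy()
            img[axis] = 2.0 * wall - point[axis]  # mirror across the wall plane
            images.append(img)
    return np.array(images)

def augment(D, sq_dists):
    """Append one row and one column of candidate squared distances to D."""
    n = D.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = D
    A[:n, n] = sq_dists
    A[n, :n] = sq_dists
    return A

def rank_residual(A, dim=3):
    """Sum of the singular values beyond the first dim + 2 (about zero for a true EDM)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(s[dim + 2:].sum())

def label_echoes(anchors, direct_sq, echo_sq, dim=3):
    """Brute force: try every echo-to-wall assignment, keep the lowest residual."""
    D = squared_edm(anchors)
    best = None
    for perm in itertools.permutations(range(len(echo_sq))):
        cand = np.concatenate(([direct_sq], echo_sq[list(perm)]))
        res = rank_residual(augment(D, cand), dim)
        if best is None or res < best[0]:
            best = (res, cand)
    return best

def multilaterate(anchors, sq_dists):
    """Linear least-squares position from squared distances to known anchors."""
    q = (anchors ** 2).sum(axis=1)
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (q[1:] - q[0]) - (sq_dists[1:] - sq_dists[0])
    return np.linalg.lstsq(A, b, rcond=None)[0]

# Hypothetical, noise-free example values (assumptions, not taken from the text).
room = np.array([5.0, 4.0, 3.0])        # room dimensions in metres
known = np.array([1.2, 2.9, 1.1])       # known microphone position
target = np.array([3.7, 1.3, 1.9])      # source position to be recovered
anchors = np.vstack([known, first_order_images(known, room)])
true_sq = ((anchors - target) ** 2).sum(axis=1)
echo_sq = true_sq[1:][[3, 0, 5, 1, 4, 2]]  # echoes arrive without wall labels
residual, labeled_sq = label_echoes(anchors, true_sq[0], echo_sq)
estimate = multilaterate(anchors, labeled_sq)
print(residual, estimate)
```

With measured data the residual of the correct labeling is only approximately zero, so a threshold or a distance to the nearest true Euclidean distance matrix would be needed instead of an exact rank test. Swapping the roles of microphone and source gives the corresponding sketch for locating a microphone from a known source.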

The method according to the invention does not have to be applied in a room. It could be applied in a system comprising at least three reflective surfaces (in 3D) and a microphone (or a source in general, e.g. a light source or a UWB source) in the middle.

The method according to the invention could be applied in a non-convex room, if it is possible to know where the image sources are.

The signal could be a light signal, the transmitter a laser, a LED or a light source in general, the receiver could be a camera, a photo-diode or any other light sensitive device. The reflective surfaces could be mirrors.

The present method can be used for tracking the trajectory of a moving source (or receiver, e.g. a microphone), for example a source (or receiver) mounted on a household robot, and thus for tracking the trajectory of the robot itself.

The present method can be used for tracking the positions of mobile devices comprising a transmitter and/or a receiver, e.g. smartphones, tablets, glasses, etc., in a room.

The present method can be used for surround sound systems in known rooms, by using a calibration microphone to determine the position of the loudspeaker.

1. A method for determining the location of an object, comprising the steps of sending a signal with one transmitter; receiving by one receiver said signal and echoes of the transmitted signal reflected by one or more reflective surfaces; associating some of said echoes to the one or more reflective surfaces; determining the location of said object on the basis of said association, said object being said transmitter or said receiver.
 2. The method of claim 1, said associating comprising: building with a computing module a Euclidean distance matrix; adding to said Euclidean distance matrix a new row and a new column, the new row and the new column corresponding to a combination of some echoes, and obtaining a modified matrix; computing the distance of the modified matrix from a true Euclidean distance matrix; determining the location of the object based on the computed distance.
 3. The method of claim 1, the location and/or the orientation of said one or more reflective surfaces being known.
 4. The method of claim 1, said Euclidean distance matrix comprising the distances between said transmitter and first order image sources, said object being said receiver.
 5. The method of claim 4, the location of the transmitter being known.
 6. The method of claim 1, said Euclidean distance matrix comprising the distances between said receiver and first order image receivers, said object being said transmitter.
 7. The method of claim 6, the location of the receiver being known.
 8. The method of claim 1, wherein only first order echoes are considered.
 9. The method of claim 1, wherein only echoes received during a predetermined time window are considered for the association.
 10. The method of claim 1, comprising the step of labeling echoes.
 11. The method of claim 1, comprising determining which of the peaks of the impulse response received by said receiver correspond to which reflective surface.
 12. The method of claim 1, comprising the step of checking if the modified matrix still verifies the rank property according to which a Euclidean distance matrix in R^(n) has a rank of at most n+2, n being a positive integer.
 13. The method of claim 12, comprising testing at least some echo combinations and selecting the combination for which the rank property is satisfied.
 14. The method of claim 1, comprising multi-dimensional scaling.
 15. The method of claim 14, comprising applying an s-stress criterion.
 16. The method of claim 1, said receiver being a microphone, said transmitter being a loudspeaker.
 17. The method of claim 16, said reflective surface being a wall of a room.
 18. The method of claim 17, said room being a convex room.
 19. The method of claim 17, said room being a non-convex room.
 20. The method of claim 1, said receiver being a light sensitive device such as a photo-diode or a camera, said transmitter being a light source.
 21. The method of claim 20, said reflective surface being a mirror.
 22. A system for determining the location of an object, comprising: one transmitter for sending a signal; one receiver for receiving the transmitted signal and the echoes of the transmitted signal as reflected by one or more reflective surfaces; a first computing module for associating some of said echoes to the one or more reflective surfaces; a second computing module for determining the location of said object on the basis of said association, said object being said transmitter or said receiver.
 23. The system of claim 22, wherein said first computing module is configured for building a Euclidean distance matrix; adding to said Euclidean distance matrix a new row and a new column, the new row and the new column corresponding to a combination of some echoes, and obtaining a modified matrix; computing the distance of the modified matrix from a true Euclidean distance matrix.
 24. The system of claim 23, wherein said second computing module is configured for determining the location of the object based on the computed distance.
 25. The system of claim 22, the first and second computing modules being the same module.
 26. The system of claim 22, said receiver being a microphone, said transmitter being a loudspeaker.
 27. The system of claim 26, said reflective surface being a wall of a room.
 28. The system of claim 27, said room being a convex room.
 29. The system of claim 27, said room being a non-convex room.
 30. The system of claim 22, said receiver being a light sensitive device such as a photo-diode or a camera, said transmitter being a light source.
 31. The system of claim 30, said reflective surface being a mirror.
 32. A computer program product, comprising: a tangible computer usable medium including computer usable program code for determining the location of an object, the computer usable program code being used for associating some of said echoes of a signal transmitted by a transmitter and received by a receiver, as reflected by one or more reflective surfaces, to these reflective surfaces; determining the location of said object on the basis of said association, said object being said transmitter or said receiver. 